Abstract

Despite the common assumption that orthologs usually share the same function, there have been various reports of 
divergence between orthologs, even among species as close as mammals. The comparison of mouse and human is 
of special interest, because mouse is often used as a model organism to understand human biology. We review the 
literature on evidence for divergence between human and mouse orthologous genes, and discuss it in the context 
of biomedical research. 
INTRODUCTION

The mouse Mus musculus is the most widely used 
model organism to understand human biology. 
Relative to other mammals, and many other
vertebrates, mice reproduce fast, have short life spans,
are inexpensive and easy to handle, and can be
manipulated at the molecular level [1]. There are
almost 400 000 publications in PubMed with 
'mouse' (or 'mice' or 'murine') in the title, second 
only to human (~700 000 publications with 'human' 
in the title). In addition to sharing the mammalian
body plan, human and mouse orthologous proteins have
a median of 78.5% amino acid sequence identity [2].
To a first approximation, it seems reasonable to expect
genes to have conserved functions, both normal and
pathological, between human and mouse. This
expectation is usually applied to orthologs. The definition
of orthology is formally based on evolutionary cri- 
teria, but is often taken to imply functional conser- 
vation (discussed in Refs [3, 4]), especially for 
one-to-one orthologs. 

The assumption of conserved function between 
orthologs has been supported even between rela- 
tively distant species, by observations of conserved 
phenotypic effects when orthologs were subject to
knock-in experiments [5, 6] or in situ [7, 8], clarifying
the role of genes involved in human diseases. 
Yet there is also some evidence of differential pheno- 
typic effects [9]. In this review, we consider some 
sources of variation of ortholog function between 
human and mouse, especially in the context of
biomedical research. We do not consider other sources
of human-mouse differences, such as the emergence
of novel genes [10].

In the specific case of humans and mice, while 
both species are placental mammals and share many 
common anatomical features and physiological pro- 
cesses, there are also a number of biological differ- 
ences, which should be expected to translate into 
differences between orthologous genes, especially 
considering a divergence time of ~100 Mya
[11, 12]. Rodents are notably small, specialized for 
gnawing, and have a high rate of reproduction [13], 
unlike primates. Mus musculus weighs 12-30 g on
average, reaches sexual maturity at 1.5 months and has
up to 10 litters per year [14]. Probably related to these
differences in life history, mouse genomes have evolved
faster than those of primates [2, 15].

Here, we first provide a few examples of experi- 
mentally determined divergence between human 
and mouse orthologs, to illustrate that such
differences are not to be dismissed as simple
mistakes in genomic studies. We try to relate these
examples to knowledge which can be derived from 
genomic databases. Then we discuss the evidence 
from comparative large-scale studies, concerning 
the frequency of differences between human and 
mouse orthologs. Of note, the 'function' of a gene 
does not have an unambiguous definition, and we 
have tried here to stay as close as possible to the 
aspects which are relevant to the use of mice as bio- 
medical model organisms. Moreover, given that this 
question has been explicitly raised relatively recently, 
we are aware that we are presenting a still very in- 
complete view, which we hope will be enriched by 
future comparative studies. 

EXAMPLES OF DIVERGENCE IN 
GENE FUNCTION BETWEEN 
HUMAN AND MOUSE 

TDP1 is a gene that participates in the repair of Topo
I-DNA complexes. The TDP1 orthologs of human and
mouse have been found to localize in the cytoplasm
and in the nucleus, respectively [16]. The mutation
TDP1 1478 A>G in humans is linked to the SCAN1
disorder, characterized by 'ataxia, cerebellar atrophy,
and peripheral neuropathy', whereas there is no
clear phenotype for this mutation in the mouse
ortholog [16]. There are no obvious differences in
gene expression patterns (as reported in Bgee, Ref.
[17]), nor evidence of positive selection on the primary
sequence (as reported in Selectome, Ref. [18])
between human and mouse. The difference in
intracellular localization of TDP1 between human and
mouse thus seems to result in different phenotypes.

While the molecular basis of inflammation is 
mostly conserved among mammals, the role of the 
two selectins, P and E, differs between human and 
mouse. The human ortholog of mouse P-selectin has 
lost the standard mammalian regulatory pathway. 
Notably, human P-selectin is not responsive to 
TNF (tumor necrosis factor), a major inflammatory
factor, a difference with major effects on the rolling 
of leukocytes in vivo, and on the contribution to in- 
flammation [19]. There also seems to be a decreased 
role of human P-selectin in contact hypersensitivity. 
As Liu et al. [19] conclude, their 'results underscore 
the need for caution in extrapolating the functions of 
P-selectin obtained in mice to humans, particularly 
in the many models where mediators are generated
that activate NF-κB- and ATF-2-dependent genes'.
Interestingly, P-selectin is often associated in the
biomedical literature with thymus activity [20], but
the evidence seems derived from mouse models.
Transcriptome data (as reported in Bgee) support 
expression of P-selectin in the thymus in mouse, 
but not in human, so it is possible that this role 
also is not conserved between the orthologs. 

LEFTY is a locus that includes two genes, 
LEFTY1 and LEFTY2, which arose by independent 
duplications in rodents and in primates (thus, human 
LEFTY1 and mouse Lefty1 are not one-to-one
orthologs, despite the names). In both mouse and human,
the LEFTY genes are involved in the establishment 
of asymmetry during development. There is some 
evidence for positive selection on Lefty1 in mouse
and rat (reported in Selectome based on 
TreeFam 7), and there is experimental evidence 
that the molecular function is carried out differently 
in human and mouse [21]. Notably, it seems that the 
asymmetric expression patterns in development are 
controlled differently in human and mouse [21]. 
Thus, similar global functions are carried out by 
orthologs, but with differences in the specifics of pro- 
tein sequence and expression pattern. Interestingly, 
Yashiro et al. [21] point out that there are also many 
specific differences in anatomical asymmetry between 
human and mouse, which might be related to these 
differences in LEFTY/Lefty function.

LARGE-SCALE QUANTITATIVE 
EVIDENCE FOR DIVERGENCE

Expression divergence

The examples above show that divergence of func- 
tion between human and mouse orthologs can be 
mediated by gene expression regulation. While the 
same level of mechanistic details cannot be provided 
in genomic studies, it is interesting in this context to 
evaluate the scale of expression divergence between 
human and mouse orthologs. 

The study of the evolution of gene expression is 
hampered by the difficulty of distinguishing experi- 
mental noise from bona fide functional divergence. In 
a careful study comparing relative expression profiles 
between human and mouse orthologs, Liao and 
Zhang [22] reanalyzed the GNF dataset of human 
and mouse microarrays [23]. They found that after 
correcting for experimental variation, only 16% of 
orthologs between human and mouse had expression 
profiles as divergent as random pairs. Housekeeping
orthologous genes appear to diverge more in
expression than tissue-specific genes [22, 24]. Conservation
of expression patterns between human-mouse
tissue-specific orthologs has been confirmed by an
alternative experimental approach [25], but without
any specific quantification of divergent orthologs.

Three points should be noted about these results. 
Firstly, even 16% of orthologs is clearly above the 5% 
accepted false positive rate of the randomization 
method, which indicates that changes in expression 
pattern between human and mouse are not very rare 
(as previously noted in Ref. [4]). Secondly, the other 
84% of genes are more conserved than a random 
expectation, but might still diverge in functionally 
relevant ways. Thirdly, Liao and Zhang [22] and
other related studies have mostly used Pearson's
correlation coefficient as a measure of gene
expression conservation, although this measure is biased,
especially for housekeeping genes [26] (B. Piasecka
et al., unpublished data). Of note, an alternative
measure, the 'Gene expression barcode' [27], which
detects organ-specific overexpression of genes, also
recovers good conservation of organ-specific expression
between human and mouse orthologs, but a more
detailed quantification is not provided.
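
To make the randomization logic concrete, the following is a minimal
sketch under our own assumptions, and not the procedure of Ref. [22]
(in particular, the correction for experimental variation is not
reproduced). The function name, the input layout and the distance
1 - r are illustrative choices. An ortholog pair is counted as 'as
divergent as random' when its expression distance is not below the
alpha-quantile of distances between randomly paired genes.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a flat profile is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def fraction_as_divergent_as_random(human, mouse, n_random=1000,
                                    alpha=0.05, seed=0):
    """Fraction of ortholog pairs not more conserved than random pairs.

    `human` and `mouse` (hypothetical layout) map the same ortholog
    identifiers to expression values measured across the same tissues,
    in the same order.
    """
    rng = random.Random(seed)
    ids = sorted(human)

    # Null distribution: distances between a human gene and the mouse
    # ortholog of a different, randomly chosen gene.
    random_distances = []
    for _ in range(n_random):
        a, b = rng.sample(ids, 2)
        random_distances.append(1 - pearson(human[a], mouse[b]))
    random_distances.sort()

    # Only about a fraction alpha of random pairs fall below this value.
    threshold = random_distances[int(alpha * n_random)]

    divergent = sum(
        1 for g in ids if 1 - pearson(human[g], mouse[g]) >= threshold
    )
    return divergent / len(ids)
```

Because the distance is built on Pearson's correlation, this sketch
inherits the bias noted above for broadly expressed genes; another
distance could be substituted in `pearson` without changing the rest
of the logic.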

Thus, it appears that the changes of expression 
pattern found in small-scale studies do not represent 
very rare evolutionary events, but rather that diver- 
gence by expression is a relatively common phenom- 
enon between human and mouse orthologs. 

Gene isoforms

Alternative splicing is very frequent in human and 
mouse genes. A methodological consequence is that, 
as gene orthology prediction is mostly based on se- 
quence similarity, orthologous genes can be errone- 
ously inferred by grouping the wrong gene isoforms, 
which might have dissimilar functions. From a more 
fundamental perspective, many differences in splicing 
patterns have been reported between human and 
mouse orthologs [28, 29]. If a significant proportion 
of these splice forms have functional roles, then this 
provides a potential path for functional divergence 
between the orthologs. 

In one study, >11% of human-mouse alternative
cassette exons were found to be subject to exon
skipping in one organism, yet constitutively spliced
in the other [29]. Non-conserved exons between
human and mouse are mostly found outside the
coding sequences, suggesting that when
non-conserved exons are localized within coding
sequences, it might be due to species-specific
functional effects [30]. In a more recent study, orthology
at the gene level was distinguished from orthology
at the transcript level (conservation of exon
structure) [31]. Even using relaxed criteria for transcript
orthology, 13% of human-mouse orthologous genes
have non-orthologous transcripts [31]. This level of
divergence, if confirmed, is of the same scale as the
divergence observed at the expression level. The gain
of splice forms has been shown to be a continuous
process in human and mouse evolution [32], which
certainly provides material for functional divergence.

The phenomenon of alternative promoters regu- 
lating different gene isoforms is related both to 
changes in expression and to changes in transcript 
structure. Sequence comparison between human- 
mouse alternative promoters shows not only rather 
low sequence conservation during evolution, but 
especially that the subsets of conserved and 
non-conserved alternative promoters can be distin- 
guished clearly [33]. For example, the human 
ACACB gene has two alternative promoters. Only 
one of those promoters is highly conserved in 
rodents, while both promoters actively regulate 
the skeletal muscle ACACB gene function in 
humans [34]. 

Differences in gene copy number 

Approximately 9% of orthologs are duplicated either 
in human, or mouse, or both independently, as was 
the case for LEFTY (estimated as the proportion of 
non one-to-one orthologs among orthologs in 
Ensembl Compara [35]). In most of these cases,
identifying which ortholog is expected to share the
function between species is difficult. Moreover, positive
selection appears to strongly affect these
lineage-specific duplicates [36], which might imply changes
in biochemical function.
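
As an illustration of how such a proportion can be estimated, the
sketch below counts human genes whose orthology relationships to
mouse are not all one-to-one. It is a minimal sketch under stated
assumptions: the input is a hypothetical list of (human gene, mouse
gene, orthology type) rows, the type labels follow the Ensembl
Compara naming (ortholog_one2one, ortholog_one2many,
ortholog_many2many), and counting per human gene is our choice, which
is not necessarily how the 9% figure above was computed.

```python
def non_one_to_one_fraction(rows):
    """Fraction of human genes with an ortholog that is not one-to-one.

    `rows` is an iterable of (human_gene, mouse_gene, orthology_type)
    tuples; this layout is assumed for the example.
    """
    types_by_gene = {}
    for human_gene, _mouse_gene, orthology_type in rows:
        types_by_gene.setdefault(human_gene, set()).add(orthology_type)

    if not types_by_gene:
        raise ValueError("no orthology relationships given")

    # A gene counts as one-to-one only if that is its sole relationship type.
    non_one_to_one = sum(
        1 for types in types_by_gene.values()
        if types != {"ortholog_one2one"}
    )
    return non_one_to_one / len(types_by_gene)
```

Counting per mouse gene, or per orthology pair, would give slightly
different values, so the counting unit should be stated alongside any
such estimate.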

Not only are genes duplicated in the human and 
mouse lineages, but copy number variations (CNVs) 
are widely observed among human and mouse gen- 
omes. These can result from local alterations, such as 
duplications, deletions, translocations or inversions. 
In humans, CNVs have been shown to be medically 
relevant, e.g. linked to the reaction to cancer treat- 
ment [37]. In mice, CNVs have a significant impact 
on the measure of gene expression [38]. CNVs 
appear to affect a biased subset of the genome. 
Human CNVs are enriched in protein coding 
genes with high synonymous and non-synonymous 
divergence from their mouse orthologs [39]. These
genes are associated with olfaction, immunity and 
protein secretion. Mouse CNVs, on the other 
hand, seem to have decreased amino acid sequence 
divergence [39]. 

These variations, and the differences in the genes 
affected, render the definition of one-to-one orthol- 
ogy more complex between human and mouse. It is 
possible to have one-to-one orthologs for some in- 
dividuals, but not for others. If the copy variants have 
differences in function (e.g. different expression 
levels), then orthologs might have functional conser- 
vation in some individuals but not others. The study 
of CNVs is mostly recent, and the functional and 
medical consequences remain to be elucidated in 
more detail. But we can already suggest that, parallel 
to the recently introduced concept of 'splicing 
orthology' [31], we might need to define a concept 
of 'copy number orthology', restricted to orthologs 
with the same number of copies in both organisms. 
Consistent with the original evolutionary definition 
of orthology, it would probably be best to restrict 
this further to the most probable ancestral copy 
number, whose function was probably conserved. 

Phenotypic divergence 

Gene-phenotype relations can be complex, and can
differ between species. For example, the alteration
of GSK3 perturbs nutrient and stress signaling in
yeast, anteroposterior patterning and segmentation
in insects, dorsoventral patterning in frogs and
craniofacial morphogenesis in mice [40, 41]. Obviously,
predicting its phenotypic implication in human is not
straightforward. More generally, predicting phenotype
from gene function across organisms is a difficult task.

Several cases of single genes linked to human dis- 
eases show apparently normal mouse phenotypes 
when experimentally manipulated. For example, 
BCL10, SGCA and PKLR are linked to different 
human diseases when mutated (from OMIM [42]), 
whereas they present no phenotypic effect in mouse. 
This indicates that there are several pathogenic 
human mutations that have become fixed in mouse 
evolution [43]. 

Liao and Zhang [44] showed that >20% of
human essential genes are non-essential in mouse, and
that the rate of evolution of those 20% is significantly
higher than for genes essential in both human and
mouse. Gene essentiality is an extreme case of
phenotypic impact, yet even orthologs that are essential
in both human and mouse can result in different
phenotypes. For example, Adamts2, Acox1 and Fancg are
essential for human [45, 46] and mouse [47, 48] but
show different phenotypic effects when mutated
(discussed in Ref. [44]). This finding shows a high rate of
functional divergence between human-mouse orthologs.
Recently, a review of 'phenologs', phenotypes
associated with orthologous genes, showed that different
phenotypes might correspond to deeper functional
homology [49]. Such research might help to identify
genes implicated in human disease, despite
phenotypic divergence between orthologs.

CONCLUSION 

This review is perforce quite limited, because a
systematic exploration of functional differences
between orthologs has only come on the agenda of
biological research recently [4, 50]. We believe that
both small-scale and large-scale studies provide
evidence that functional divergence between human
and mouse orthologs, although a minority
phenomenon, still affects a significant proportion of genes.
Divergence of gene expression, of alternative
splicing, and of mutant phenotypes each affect on the
order of 10-20% of ortholog pairs, under conservative
estimates. If these and other processes affect
different genes, then a majority of genes might be
affected. But even if the same genes differ in
expression pattern, splicing, etc., having ~15% of
human-mouse orthologs with strong differences will
affect many pathways and biological processes of
interest. We look forward to future explorations of this
topic, preferably combining high-quality experimental
data and large-scale approaches.
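
The arithmetic behind these two scenarios can be sketched as follows.
This is an illustration only: the function name is ours, and the
rates passed to it are placeholders within the 10-20% range quoted
above, not measured values. Given the fraction of ortholog pairs
affected by each process, the sketch returns the total affected
fraction if the same genes are affected by every process (complete
overlap), if the processes are statistically independent, and if they
affect entirely different genes (no overlap).

```python
from math import prod


def affected_fraction_bounds(rates):
    """Return (complete overlap, independent, no overlap) fractions.

    `rates` lists, for each process, the fraction of ortholog pairs
    it affects (values between 0 and 1).
    """
    if not rates:
        raise ValueError("at least one rate is required")
    # Same genes affected by every process: the largest single rate.
    complete_overlap = max(rates)
    # Independent processes: one minus the chance of escaping all of them.
    independent = 1 - prod(1 - p for p in rates)
    # Entirely different genes: the rates add up, capped at 100%.
    no_overlap = min(1.0, sum(rates))
    return complete_overlap, independent, no_overlap
```

For three processes with placeholder rates of 0.10, 0.15 and 0.20,
this gives roughly 0.20, 0.39 and 0.45; additional processes would
push the last two values further up.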
Key Points

• Significant divergence in expression between human and mouse 
orthologs. 

• High divergence of alternative splicing between human and 
mouse orthologs. 

• Fast evolution of genes with copy number variants in human. 

• Significant divergence in gene-phenotype relations between 
human and mouse orthologs.

• This divergence is relevant to biomedical research using mouse. 